Evolving Networks with Multi-species Nodes 
and Spread in the Number of Initial Links 
We consider models for growing networks incorporating two effects not previously considered: (i)
different species of nodes, with each species having different properties (such as different attachment
probabilities to other node species); and (ii) when a new node is born, its number of links to old nodes
is random with a given probability distribution. Our numerical simulations show good agreement
with analytic solutions. As an application of our model, we investigate the movie-actor network
with movies considered as nodes and actors as links.

PACS numbers: 05.10.-a, 05.45.Pq, 02.50.Cw, 87.23.Ge



I. INTRODUCTION 

It is known that many evolving network systems, including the world wide web, as well as
social, biological, and communication systems, show power law distributions. In particular,
the number of nodes with $k$ links is often observed to be $n_k \sim k^{-\nu}$, where $\nu$
typically varies from 2.0 to 3.1. The mechanism for power-law network scaling was addressed
in a seminal paper by Barabási and Albert (BA), who proposed a simple growing network model
in which the probability of a new node forming a link with an old node (the "attachment
probability") is proportional to the number of links of the old node. This model yields a
power law distribution of links with exponent $\nu = 3$. Many other works have extended
this model. For example, Krapivsky and Redner provide a comprehensive description for a
model with more general dependence of the attachment probability on the number $k$ of old
node links. For attachment probability proportional to $A_k = ak + b$ they found that,
depending on $b/a$, the exponent $\nu$ can vary from 2 to $\infty$. Furthermore, for
$A_k \sim k^{\alpha}$, when $\alpha < 1$, $n_k$ decays faster than a power law, while when
$\alpha > 1$, there emerges a single node which connects to nearly all other nodes. Other
modifications of the model are the introduction of aging of nodes, initial attractiveness
of nodes, the addition or rewiring of links, the assignment of weights to links, etc.

We have attempted to construct more general growing network models featuring two effects
which have not been considered previously: (i) multiple species of nodes [in real network
systems, there may be different species of nodes with each species having different
properties (e.g., each species may have different probabilities for adding new nodes and
may also have different attachment probabilities to the same node species and to other
node species, etc.)]; and (ii) initial link distributions [i.e., when a new node is born,
its number of links to old nodes is not necessarily a constant number, but, rather, is
characterized by a given probability distribution $p_k$ of new links].

As an application of our model, we investigate the movie-actor network with movies
considered as nodes and actors as links (i.e., if the same actor appears in two movies
there is a link between the two movies). Moreover, we consider theatrical movies and
made-for-television movies to constitute two different species.



II. MODEL 

We construct a growing network model which incorporates multiple species and initial link
probabilities. Given an initial network, we create new nodes at a constant rate. We let
the new node belong to species $j$ with probability $Q^{(j)}$ ($\sum_j Q^{(j)} = 1$). We
decide how many links $l$ the new node establishes with already existing nodes by randomly
choosing $l$ from a probability distribution $p_l^{(j)}$. Then, we randomly attach the new
node to $l$ existing nodes with preferential attachment probability proportional to a
factor $A_k^{(j,i)}$, where $k$ is the number of links of the target node of species $i$
to which the new node of species $j$ may connect. That is, the connection probability
between an existing node and a new node is determined by the number of links of the
existing node and the species of the new node and the target node.
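The growth rule just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the simulation code used for the results reported here: the seed network (two linked species-0 nodes), the 0-based species indices, the choice to draw the $l$ targets independently (so repeated links to the same node are allowed), and the names `grow_network` and `example_attachment` are all hypothetical.

```python
import random


def example_attachment(j, i, k):
    """Hypothetical A_k^{(j,i)} with 0-based species indices.

    New species-0 nodes attach to any node in proportion to k;
    new species-1 nodes attach only to species-0 nodes.
    """
    return float(k) if (j == 0 or i == 0) else 0.0


def grow_network(n_new, Q, p, attachment, seed=0):
    """Grow a network by n_new nodes and return (species, degree) lists.

    Q[j]       : probability that a new node belongs to species j
    p[j]       : dict {l: probability} for the initial number of links l
    attachment : function (j, i, k) -> non-negative weight A_k^{(j,i)}
    """
    rng = random.Random(seed)
    # Assumed seed network: two species-0 nodes joined by one link.
    species = [0, 0]
    degree = [1, 1]
    for _ in range(n_new):
        # Choose the species j of the new node with probability Q[j].
        j = rng.choices(range(len(Q)), weights=Q)[0]
        # Draw the number of initial links l from p[j].
        counts = list(p[j].keys())
        l = rng.choices(counts, weights=[p[j][c] for c in counts])[0]
        # Weight every existing node by A_k^{(j,i)}.
        weights = [attachment(j, species[v], degree[v])
                   for v in range(len(species))]
        # Assumption: the l targets are drawn independently (with replacement).
        targets = rng.choices(range(len(species)), weights=weights, k=l)
        for v in targets:
            degree[v] += 1
        species.append(j)
        degree.append(l)
    return species, degree
```

Recomputing all weights at every step costs time proportional to the current network size, which is acceptable for small illustrative runs; a large simulation would presumably maintain the weights incrementally.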

As for the single species case, the evolution of this model can be described by rate
equations. In our case the rate equations give the evolution of $N_k^{(i)}$, the number
of species $i$ nodes that have $k$ links,

$$\frac{dN_k^{(i)}}{dt} = \sum_{j=1}^{S} Q^{(j)} \bar{k}^{(j)} \,
\frac{A_{k-1}^{(j,i)} N_{k-1}^{(i)} - A_k^{(j,i)} N_k^{(i)}}
{\sum_m \sum_{k'} A_{k'}^{(j,m)} N_{k'}^{(m)}} + Q^{(i)} p_k^{(i)}, \qquad (1)$$

where $S$ is the total number of species, $\bar{k}^{(j)} = \sum_l l\, p_l^{(j)}$ is the
average number of new links to a new node of species $j$, and $t$ is normalized so that
the rate of creation of new nodes is one per unit time. The term proportional to
$A_{k-1}^{(j,i)} N_{k-1}^{(i)}$ accounts for the increase of $N_k^{(i)}$ due to the
addition of a new node of species $j$ that links to a species $i$ node with $k - 1$
connections. The term proportional to $A_k^{(j,i)} N_k^{(i)}$ accounts for the decrease of
$N_k^{(i)}$ due to linking of a new species $j$ node with an existing species $i$ node
with $k$ connections. The denominator, $\sum_m \sum_{k'} A_{k'}^{(j,m)} N_{k'}^{(m)}$, is a
normalization factor. If we add a new node with $l$ initial links, we have $l$ chances of
increasing/decreasing $N_k^{(i)}$. This is accounted for by the factor
$\bar{k}^{(j)} = \sum_l l\, p_l^{(j)}$ appearing in the summand of Eq. (1). The last term,
$Q^{(i)} p_k^{(i)}$, accounts for the introduction of new nodes of species $i$. Since all
nodes have at least one link, $N_0^{(i)} = 0$.



III. ANALYSIS OF THE MODEL

Equation (1) implies that the total number of nodes and the total number of links increase
at fixed rates. The total number of nodes of species $i$ increases at the rate $Q^{(i)}$.
Thus

$$\sum_k N_k^{(i)} = Q^{(i)} t. \qquad (2)$$

The link summation over all species, $\sum_i \sum_k k N_k^{(i)}$, is twice the total number
of links in the network. Thus

$$\sum_i \sum_k k N_k^{(i)} = 2 \langle k \rangle t, \qquad (3)$$

where $\langle k \rangle = \sum_j \sum_k Q^{(j)} k\, p_k^{(j)} = \sum_j Q^{(j)} \bar{k}^{(j)}$.
Solutions of Eq. (1) occur in the form (cf. the case of single species nodes)

$$N_k^{(i)} = n_k^{(i)} t, \qquad (4)$$

where $n_k^{(i)}$ is independent of $t$. Eq. (1) yields

$$(B_k^{(i)} + 1)\, n_k^{(i)} - B_{k-1}^{(i)} n_{k-1}^{(i)} = Q^{(i)} p_k^{(i)}, \qquad (5)$$

where $B_k^{(i)}$ is

$$B_k^{(i)} = \sum_{j=1}^{S} \frac{Q^{(j)} \bar{k}^{(j)} A_k^{(j,i)}}
{\sum_m \sum_{k'} A_{k'}^{(j,m)} n_{k'}^{(m)}}. \qquad (6)$$

To most simply illustrate the effect of spread in the initial number of links, we first
consider the case of a network with a single species of node and with a simple form for
the attachment $A_k = A_k^{(1,1)}$. In particular, we choose $A_k = k + c$. (Note that by
Eq. (1) this is equivalent to $A_k = ak + b$ with $c = b/a$.) Inserting this $A_k$ into
Eq. (6), we obtain $\sum_k (k + c)\, n_k = 2\langle k \rangle + cQ$ and
$B_k = (k + c)/\eta$, where
$\eta = (2\langle k \rangle + cQ)/(Q\bar{k}) = 2 + c/\bar{k} > 2$. (Note that
$\langle k \rangle = Q\bar{k}$ for the single species case.) Thus Eq. (5) yields

$$[(k + c)\, n_k - (k + c - 1)\, n_{k-1}] + \eta\, n_k = \eta Q p_k. \qquad (7)$$

Setting $p_k = p_1 (k + c)^{-\beta}$, we can solve Eq. (7) for large $k$ by approximating
the discrete variable $k$ as continuous, so that

$$(k + c)\, n_k - (k + c - 1)\, n_{k-1} \approx \frac{d}{dk}[(k + c)\, n_k]. \qquad (8)$$

Solution of the resulting differential equation,

$$\frac{d}{dk}[(k + c)\, n_k] + \eta\, n_k = \eta Q p_1 (k + c)^{-\beta}, \qquad (9)$$

for $n_k$ with $\beta \neq \eta + 1$ consists of a homogeneous solution proportional to
$(k + c)^{-(\eta+1)}$ plus the particular solution,
$[\eta Q p_1/(\eta + 1 - \beta)](k + c)^{-\beta}$. For $\beta = \eta + 1$ the solution is
$n_k = \eta Q p_1 (k + c)^{-(\eta+1)} \ln[d(k + c)]$, where $d$ is an arbitrary constant.
Hence, for sufficiently large $k$ we have $n_k \sim k^{-(\eta+1)}$ if $\beta > \eta + 1$,
and $n_k \sim k^{-\beta}$ if $\beta < \eta + 1$. Thus the result for $\beta > \eta + 1$ is
independent of $\beta$ and, for $c = 0$, coincides with the previously known result
($\eta + 1 = 3$ when $c = 0$). Solutions of Eq. (7) for $n_k$ versus $k$ are shown as open
circles in Fig. 1(a) for initial link probabilities of the form

$$p_k = \begin{cases} p_1 k^{-1} & \text{for } 1 \le k \le 10^2, \\
p_1 10^{2(\beta-1)} k^{-\beta} & \text{for } k > 10^2, \end{cases} \qquad (10)$$

which are plotted as solid lines in Fig. 1(a). The values of $\beta$ used for the figure
are $\beta = 0.5$, 1, 2, 3, 4, and $\infty$ ($\beta = \infty$ corresponds to $p_k = 0$ for
$k > 10^2$). For clarity $n_k$ has been shifted by a constant factor so that $n_1$
coincides with the corresponding value of $p_1$. Also, to separate the graphs for easier
visual inspection, the value of $p_1$ for successive $\beta$ values is changed by a
constant factor [since Eq. (7) is linear, the form of the solution is not affected]. We
note from Fig. 1(a) that $n_k$ follows $p_k$ for $k \le 10^2$ in all cases. This is as
expected, since $p_k$ decreases slower than $k^{-3}$ in this range. Furthermore, $n_k$ very
closely follows $p_k$ for $k > 10^2$ for $\beta = 0.5$, 1.0, 2.0. As $\beta$ increases,
deviations of $n_k$ from $p_k$ in $k > 10^2$ become more evident, and the large $k$
asymptotic $k^{-3}$ dependence is observed. Thus, if $p_k$ decreases sufficiently rapidly,
then the behavior of $n_k$ is determined by the growing network dynamics, while, if $p_k$
decreases slowly, then the behavior of $n_k$ is determined by $p_k$.
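The crossover between the two regimes can be checked by iterating the recursion of Eq. (7) directly with the initial link probabilities of Eq. (10). The following is a minimal sketch, assuming $c = 0$ (so that $\eta = 2$), $Q = 1$, and an unnormalized $p_1 = 1$ (harmless because the recursion is linear); the function names are illustrative only.

```python
import math


def initial_link_probability(k, beta, p1=1.0):
    """Eq. (10): p_k ~ k^-1 up to k = 100, then ~ k^-beta."""
    if k <= 100:
        return p1 / k
    if math.isinf(beta):
        return 0.0  # beta = infinity means p_k = 0 beyond k = 100
    return p1 * 10.0 ** (2.0 * (beta - 1.0)) * k ** (-beta)


def solve_recursion(beta, k_max=10**4, c=0.0, eta=2.0, Q=1.0):
    """Iterate Eq. (7); the returned list holds n_k at index k."""
    n = [0.0] * (k_max + 1)  # n[0] = 0: no node has zero links
    for k in range(1, k_max + 1):
        source = eta * Q * initial_link_probability(k, beta)
        # Eq. (7) solved for n_k in terms of n_{k-1}.
        n[k] = (source + (k + c - 1.0) * n[k - 1]) / (k + c + eta)
    return n


def log_slope(n, k1, k2):
    """Average logarithmic slope of n_k between k1 and k2."""
    return (math.log(n[k2]) - math.log(n[k1])) / (math.log(k2) - math.log(k1))
```

For example, `log_slope(solve_recursion(4.0), 1000, 10000)` should lie close to $-(\eta + 1) = -3$, while `log_slope(solve_recursion(2.0), 1000, 10000)` should lie close to $-\beta = -2$, in line with the large $k$ analysis above.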

FIG. 1: (a) $n_k$ and $p_k$ versus $k$ for the single species network model. Solid lines
are the initial link probability $p_k$ and circles are the $n_k$ obtained from Eq. (7).
(b) $n_k^{(1)}$ and $n_k^{(2)}$ versus $k$ for the two species network model. Circles
(species 1) and crosses (species 2) are log-binned data from our numerical simulation. The
total number of nodes in our numerical network system is $10^{[\ldots]}$. The dashed lines
are solutions obtained from (11) and (13).

To simply illustrate the effect of multiple species we now consider a growing two species
network with $p_k = \delta_{1,k}$ (i.e., $p_k = 0$ for $k \ge 2$). Then, Eq. (5) becomes

$$(B_k^{(1)} + 1)\, n_k^{(1)} - B_{k-1}^{(1)} n_{k-1}^{(1)} = Q^{(1)} \delta_{1,k}, \qquad (11a)$$

and

$$(B_k^{(2)} + 1)\, n_k^{(2)} - B_{k-1}^{(2)} n_{k-1}^{(2)} = Q^{(2)} \delta_{1,k}, \qquad (11b)$$

with $B_k^{(i)} = \sum_{j=1}^{2} Q^{(j)} A_k^{(j,i)} / \sum_m \sum_{k'} A_{k'}^{(j,m)} n_{k'}^{(m)}$,
where $\sum_m$ represents summation over species 1 and 2 nodes.

In order to illustrate the model with our numerical simulations, we specialize to a
specific case. We choose attachment coefficients $A_k^{(1,1)} = ak$, $A_k^{(1,2)} = ak$,
$A_k^{(2,1)} = bk$, and $A_k^{(2,2)} = 0$. Thus a new species 1 node connects to either
existing species 1 nodes or species 2 nodes with equal probability, while a new species 2
node can connect to existing species 1 nodes only. Therefore, the first summation term in
Eq. (11), $\sum_m \sum_k A_k^{(1,m)} n_k^{(m)}$, becomes
$a \sum_k (k n_k^{(1)} + k n_k^{(2)})$, which is $a$ times the total increase of links at
each time, $a \times 2(Q^{(1)} + Q^{(2)})$. Recall that $Q^{(1)} + Q^{(2)} = 1$. In order
to calculate the second summation term in Eq. (11),
$\sum_m \sum_k A_k^{(2,m)} n_k^{(m)} = b \sum_k k n_k^{(1)}$, we define a parameter
$\gamma$ that is the ratio of the total number of links of species 1 to the total number
of links in the network. Since the probability of linking a new species 1 node to existing
species 1 nodes is determined by the total number of links of species 1, this probability
is exactly the same as $\gamma$. Thus, if we add a new species 1 node, the number of links
of species 1 increases by $Q^{(1)}$ due to the new node and by $\gamma Q^{(1)}$ due to the
existing species 1 nodes that become connected with the new node, while the number of
links of species 2 increases by $(1 - \gamma) Q^{(1)}$. But, if we add a new species 2
node, the number of links increases by $Q^{(2)}$ for both species because a new species 2
node can link to species 1 nodes only. Thus, the increase of species 1 links is
$(1 + \gamma) Q^{(1)} + Q^{(2)}$ and that of species 2 links is
$(1 - \gamma) Q^{(1)} + Q^{(2)}$. Since $\gamma$ is the ratio of the number of species 1
links to the total number of links, $\gamma = [(1 + \gamma) Q^{(1)} + Q^{(2)}]/2$ or

$$\gamma = \frac{1}{2 - Q^{(1)}}. \qquad (12)$$

With this $\gamma$, Eq. (11) becomes

$$[k\, n_k^{(i)} - (k - 1)\, n_{k-1}^{(i)}] + \eta^{(i)} n_k^{(i)} = \eta^{(i)} Q^{(i)} \delta_{1,k}, \qquad (13)$$

where we obtain $\eta^{(1)} = 2/[Q^{(1)} + Q^{(2)}(2 - Q^{(1)})]$ and
$\eta^{(2)} = 2/Q^{(1)}$.

Proceeding as for the single species case, we approximate (13) by an ordinary differential
equation [cf. Eq. (9)] to obtain $n_k^{(i)} \sim k^{-(1 + \eta^{(i)})}$. As an example, we
set $Q^{(1)} = Q^{(2)} = 0.5$, in which case Eqs. (13) give exponents
$1 + \eta^{(1)} = 2.6$ and $1 + \eta^{(2)} = 5$. In Fig. 1(b) we plot, for this case, the
analytic solution obtained from (11) and (13) as dashed lines, and the results of
numerical simulations as open circles and pluses. The simulation results, obtained by
histogram binning with uniform bin size in $\log k$, agree with the analytic solutions,
and both show the expected large $k$ power law behaviors.


IV. THE MOVIE-ACTOR NETWORK

We now investigate the movie-actor network. We collected data from the Internet Movie
Data Base (IMDB) web site. The total number of movies is 285,297 and the total number of
actors/actresses is 555,907. Within this database are 226,325 theatrical movies and 24,865
made-for-television movies. The other movies in the database are made for television
series, video, mini series, and video games. In order to get good statistics, we choose
only theatrical and television movies made between 1950 and 2000. Thus we have two species
of movies. We also consider only actors/actresses from these movies. We consider two
movies to be linked if they have an actor/actress in common. We label the theatrical
movies species 1, and the made-for-television movies species 2.

In order to apply our model, Eq. (1), we require as input $Q^{(i)}$, $p_k^{(i)}$, and
$A_k^{(j,i)}$, which we obtain from the movie-actor network data. We take $Q^{(1)}$ and
$Q^{(2)}$ to be, respectively, the fractions of theatrical and made-for-television movies
in our data base. We obtain $Q^{(1)} = 0.83$ and $Q^{(2)} = 0.17$. We now consider
$p_k^{(i)}$. Suppose a new movie is produced casting $r$ actors. For each actor $s$
($s = 1, 2, \ldots, r$) let $l_s$ denote the number of previous movies in which that actor
appeared. Then the total number of the initial links of the new movie is
$\sum_{s=1}^{r} l_s$. From histograms of this number, we obtain (Fig. 2) the initial link
probability distributions $p_k^{(i)}$.

FIG. 2: The initial link probability distributions $p_k^{(i)}$ of (a) theatrical movies
and (b) television movies. These plots are obtained using bins of equal width in $\log k$
and dividing the number of nodes in each bin by the product of the bin width in $k$ (which
varies from bin to bin) and the total number of nodes.

The attachment $A_k^{(j,i)}$ can be numerically obtained via

$$A_k^{(j,i)} \sim \frac{\langle \Delta(j; i, k) \rangle}{\delta t}, \qquad (14)$$

where $\Delta(j; i, k)$ is the increase during a time interval $\delta t$ in the number of
links between old species $i$ nodes that had $k$ links and new species $j$ nodes, and
$\langle \ldots \rangle$ is an average over all such species $i$ nodes. In the movie
network, we count all movies and links from 1950 to 1999, and measure the increments in
the number of links for a $\delta t$ of one year. We obtain attachment coefficients
$A_k^{(1,1)} \sim 0.10\, k^{0.59}$ and $A_k^{(2,1)} \sim 0.04\, k^{[\ldots]}$ for theatrical
movies, and $A_k^{(1,2)} \sim [\ldots]$ and $A_k^{(2,2)} \sim 0.04\, k^{[\ldots]}$ for
television movies. See Fig. 3.

FIG. 3: Attachment coefficients for theatrical movies (a) $A_k^{(1,1)}$ and (b)
$A_k^{(2,1)}$, and for television movies (c) $A_k^{(1,2)}$ and (d) $A_k^{(2,2)}$. All data
are obtained using log-binning without normalization (see caption to Fig. 2).
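The log-binning procedure described in the caption to Fig. 2 (bins of equal width in $\log k$, counts divided by the bin width in $k$ and by the total number of nodes) can be sketched as follows. This is an illustrative sketch only; the number of bins per decade and the function name are assumptions, not values taken from the analysis above.

```python
import math


def log_binned_density(values, bins_per_decade=5):
    """Return a list of (k_low, k_high, density) for positive integers.

    Bins have equal width in log10(k); each count is divided by the
    bin width in k and by the total number of values.
    """
    total = len(values)
    counts = {}
    for k in values:
        b = math.floor(math.log10(k) * bins_per_decade)
        counts[b] = counts.get(b, 0) + 1
    result = []
    for b in sorted(counts):
        k_low = 10.0 ** (b / bins_per_decade)
        k_high = 10.0 ** ((b + 1) / bins_per_decade)
        density = counts[b] / ((k_high - k_low) * total)
        result.append((k_low, k_high, density))
    return result
```

Dividing by the bin width in $k$ is what makes the result a probability density, so that the density multiplied by the bin width sums to one over all occupied bins.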

Incorporating these results for $Q^{(i)}$, $p_k^{(i)}$, and $A_k^{(j,i)}$ in our
multi-species model, Eq. (1), we carry out numerical simulations as follows: (i) We add a
new movie at each time step. We randomly designate each new movie as a theatrical movie
with probability $Q^{(1)} = 0.83$ or a television movie with probability
$Q^{(2)} = 0.17$. (ii) With initial link probability $p_k^{(i)}$, we randomly choose the
number of connections to make to old movies. (iii) We then use the attachment
$A_k^{(j,i)}$ to randomly choose connections of the new species $j$ movie to old species
$i$ movies. (iv) We repeat (i)-(iii) adding 100,000 new movies, and finally calculate the
probability distributions of movies with $k$ links.

Figure 4 shows $n_k^{(i)}$ versus $k$ obtained from our movie-actor network data base
(dots) and from numerical simulations using Eq. (1) (open circles) with our empirically
obtained results for $Q^{(i)}$, $p_k^{(i)}$, and $A_k^{(j,i)}$. The results are roughly
consistent with the existence of two scaling regions [11]. For small $k$
($k < 10^{[\ldots]}$) the two species exhibit slow power law decay with different
exponents, while for large $k$ the probabilities decay much more rapidly. Indeed, previous
results suggest that the decay should be exponential for large $k$ since the attachment
$A_k^{(j,i)}$ grows sublinearly with $k$. We showed in Sec. III for the single species
model with a linear attachment $A_k \sim k$ that $n_k$ follows $p_k$ when $p_k$ decays
slowly, while $n_k$ is independent of $p_k$ when $p_k$ decays sufficiently quickly. As we
will later show, this feature is also applicable to multi-species networks with nonlinear
attachments. As seen in Figs. 5(a) and 5(b), $n_k^{(i)}$ follows $p_k^{(i)}$ in the small
$k$ region. However, it is not clear whether $n_k^{(i)}$ follows $p_k^{(i)}$ in the large
$k$ region. In order to check the behavior of $n_k^{(i)}$ in this region, we carried out
another numerical simulation using an initial link probability $\tilde{p}_k^{(i)}$ which
is cut off at $k = 50$. That is,
$\tilde{p}_k^{(i)} = p_k^{(i)} / \sum_{k' \le 50} p_{k'}^{(i)}$ when $k \le 50$ and
$\tilde{p}_k^{(i)} = 0$ when $k > 50$. Using $\tilde{p}_k^{(i)}$ in place of $p_k^{(i)}$,
we obtain from our simulation corresponding data, $\tilde{n}_k^{(i)}$ versus $k$, which are
shown in Figs. 5(c) and 5(d) as filled circles. For comparison the data for $n_k^{(i)}$
from Figs. 5(a) and 5(b) are plotted in Figs. 5(c) and 5(d) as open circles. It is seen
that the cutoff at $k = 50$ induces a substantial change in the distribution of the number
of links for $k > 50$. Thus it appears that, in the range tested, the large $k$ behavior
of the movie-actor network is determined by the initial link probability $p_k^{(i)}$
rather than by the dynamics of the growing network.

FIG. 4: The probability distributions $n_k^{(i)}$ of movies that have $k$ links; (a)
theatrical movies $n_k^{(1)}$ and (b) television movies $n_k^{(2)}$. Dots are $n_k^{(i)}$
obtained from the movie network while circles are from numerical simulation using
$Q^{(i)}$ obtained from our data base, $p_k^{(i)}$ in Fig. 2 and $A_k^{(j,i)}$ in Fig. 3.
All data are obtained using log-binning (see caption to Fig. 2).

FIG. 5: (a) and (b) are $n_k^{(i)}$ (circles) obtained from numerical simulations using
$p_k^{(i)}$ (dashed lines), while (c) and (d) show $n_k^{(i)}$ from (a) and (b) (open
circles) plotted with results denoted $\tilde{n}_k^{(i)}$ (filled circles) from simulation
using a cutoff initial link probability $\tilde{p}_k^{(i)}$ (where
$\tilde{p}_k^{(i)} = p_k^{(i)} / \sum_{k' \le 50} p_{k'}^{(i)}$ when $k \le 50$ and
$\tilde{p}_k^{(i)} = 0$ when $k > 50$). All data are obtained using log-binning (see
caption to Fig. 2).
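The cutoff initial link probability used in this test can be written compactly. The sketch below assumes the distribution is stored as a dictionary mapping $k$ to $p_k$ and that the cutoff keeps $k \le 50$ and renormalizes over the retained values; the function name and data layout are illustrative assumptions.

```python
def cutoff_distribution(p, k_cut=50):
    """Truncate a distribution {k: p_k} at k_cut and renormalize.

    Entries with k <= k_cut are divided by their total weight;
    entries with k > k_cut are set to zero.
    """
    norm = sum(prob for k, prob in p.items() if k <= k_cut)
    return {k: (prob / norm if k <= k_cut else 0.0) for k, prob in p.items()}
```

For instance, applying it to `{1: 0.5, 10: 0.3, 100: 0.2}` keeps the first two entries, rescaled so that they sum to one, and sets the entry at $k = 100$ to zero.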

In conclusion, in this paper we propose a model for a multi-species network with variable
initial link probabilities. We have investigated the movie-actor network as an example. We
believe that the effect of multiple species nodes may be important for modeling other
complicated networks (e.g., the world wide web can be divided into commercial sites and
educational or personal sites). We also conjecture that the initial link probability is a
key feature of many growing networks.